Early detection of breast cancer depends on a precise feature extraction
technique that produces discriminative features. Gabor filters, gray level
co-occurrence matrices (GLCM) and other textural feature extraction
techniques have achieved promising results but are often characterized by a
high false-positive rate (FPR) and false-negative rate (FNR), together with
high computational complexity. This study optimized textural features for
mass classification in digital mammography using the weighted average
gravitational search algorithm (WA-GSA). The Gabor and GLCM features were
fused and optimized using WA-GSA to overcome the weaknesses of the
individual textural feature techniques. With a support vector machine (SVM)
as the classifier, the proposed algorithm was compared with commonly applied
techniques. Experimental results show that the SVM with WA-GSA features
achieved an FPR, FNR and accuracy of 1.60%, 9.68% and 95.71%, respectively,
in 271.83 s. Meanwhile, the SVM with Gabor features achieved an FPR, FNR and
accuracy of 3.21%, 12.90% and 93.57%, respectively, in 2351.29 s, while the
SVM with GLCM features achieved an FPR, FNR and accuracy of 4.28%, 18.28%
and 91.07%, respectively, in 384.54 s. The results show the superiority of
the proposed algorithm, WA-GSA, in the classification of masses for breast
cancer detection.
1. INTRODUCTION


Digital mammography is a powerful technique that helps in the diagnosis of breast cancers at
early stages [1]. Early detection helps prevent growth to a complicated stage
that could require surgery. Proper screening and abnormality detection forestall unnecessary biopsies and
radiation therapies and thus increase the likelihood of patient survival [2], [3].
Malignancy is indicated by the presence of masses and microcalcifications in the breast region.
The successful analysis of breast cancer relies on features extracted from the suspicious areas and on the
classification of these features using a classifier or a combination of classifiers [4], [5]. The enhancement and
extraction of regions of interest (ROI) in digital images is the most challenging task in the computerized
diagnosis of breast cancer using mammographic images. This is due to the low contrast of the images, which
complicates the handling of two major concerns, namely false positives and false negatives [6].
False-positive results could lead to surgery for benign (noncancerous) conditions. Meanwhile,
false-negative results could allow early-stage disease to develop to a more complicated stage with lower
survival rates.

Computer-aided detection and diagnosis (CAD) can be applied to digital images to help
radiologists analyze the overall images and highlight the likely areas of concern for further analysis [7].
The two significant phases of a CAD system for mass detection are the detection of suspicious ROI in
mammogram images and the classification of these ROI into masses (malignant) or normal cases. One of
the critical stages in the classification of ROI is feature extraction, which directly affects the classification
rate [2]. An assortment of computer-aided methods has recently been examined, with different
levels of success in the analysis of digital mammograms [8].

The gravitational search algorithm (GSA) is an efficient optimization algorithm based on mass
interactions and the law of gravity [9]. GSA uses Newton's theory of gravity, and its search agents
are a set of masses. Every mass in the system perceives
the location of the other masses through the gravitational force, which acts as a means of passing information
between the masses [10].

The Gabor filter is a method that has been widely used for textural description in different imaging
applications [3]. Gabor filters decompose an image into different scales and orientations and analyze texture
patterns efficiently. Mammograms are highly textured, which makes Gabor filters suitable for their texture
analysis [11].

The gray level co-occurrence matrix (GLCM) is a second-order statistical method that calculates
the frequency of pixel pairs having the same grey levels in an image and applies additional knowledge
obtained from spatial pixel relations [12]-[15]. GLCM has been widely used for image texture analysis [16],
where it serves as a feature extraction technique for computing textural measures. The co-occurrence matrix
reveals the grey level spatial dependency along different angular relationships (horizontal, vertical and
two diagonal directions) on an image. The co-occurrence matrix embeds the distribution of grayscale
transitions using edge information. Since most of the information required for computing threshold values is
embedded in the GLCM, it has emerged as a basic yet efficient technique [17].

Khan et al. [18] proposed improved Gabor features for mass classification in mammography. The
study introduced optimization of Gabor filter banks based on an incremental clustering algorithm and particle
swarm optimization (PSO). An SVM with a Gaussian kernel was used as the fitness function for PSO, and the
approach was assessed on 1024 ROI extracted from the digital database for screening mammography (DDSM) using four
performance measures (i.e., accuracy, area under the receiver operating characteristic (ROC) curve, sensitivity,
and specificity). The outcomes showed that the proposed technique improves performance and reduces the
computational cost. Shirazi and Rashedi [10] proposed feature weighting for cancer tumor detection in
mammography images using a GSA with GLCM as the feature extraction technique. The GSA was used as a
tool for optimizing the feature weighting (FW) and tuning the classifier (k-NN). The weighted
features and the tuned k-NN classifier were then used to detect tumors. The results showed
good efficiency of GSA-based FW-kNN classification for breast cancer tumor detection. Hussain et al. [19]
presented a comparison of different Gabor features for mass classification in mammography. The study
explored the performance of six different Gabor feature extraction approaches for mass classification
problems. The technique employed Gabor filter banks to extract multiscale and multi-orientation texture
features that represent structural characteristics of masses and normal dense tissues in mammograms. The
feature extraction approaches were evaluated on ROI extracted from the MIAS database. A support
vector machine (SVM) was used to classify the generated unbalanced datasets. The experimental
outcome revealed that the proposed method was able to reduce the false positives and false negatives.
However, it is computationally expensive and time-consuming.

Reliable CAD systems for mass detection depend on robust feature extraction techniques. To
improve the efficiency and accuracy of a CAD system, it is crucial to extract the most discriminative features
efficiently [2]. The aforementioned textural feature extraction techniques developed for efficient mass
detection in early breast cancer CAD diagnosis are quite robust. However, they are limited by a high
rate of false-positive and false-negative diagnoses. They are also mostly associated with high
computational complexity. To address these limitations, this study aims to develop a hybrid technique for
textural feature extraction in breast cancer diagnosis. The specific objectives are: i) to fuse the Gabor and
GLCM textural feature extraction techniques using the weighted average gravitational search algorithm
(WA-GSA), ii) to apply the developed hybrid technique to selected mammographic images
for training and testing, and iii) to compare the performance of the new hybrid technique with the existing
ones based on accuracy, false-positive rate, false-negative rate, and computational time. Section 2
presents the detailed research method, while the results are given in section 3. Section 4 concludes the study
and gives recommendations for further studies.


2. RESEARCH METHOD 

Digital mammographic images including normal, benign, and cancerous cases were first acquired.
Then, the images were pre-processed to obtain the desired image quality for further processing. Next,
the ROI boundaries, edges and curves of the pre-processed images were obtained and the results were
subsequently segmented. Gabor filter texture segmentation and extraction were performed before classification.
The features of the obtained dataset were extracted based on the developed algorithm before being classified
for training and testing.


2.1. Image acquisition 

Digital mammographic images were acquired from the image retrieval in medical applications
(IRMA) database of the Aachen University of Technology, Germany. This dataset provides 9,852
radiographs, which include normal, benign, and cancerous cases. Each study includes two images of each
breast, acquired in craniocaudal (CC) and medio-lateral (ML) views, that were scanned from
film-based sources by four different scanners with a resolution between 50 and 42. For this study, seven
hundred (700) digital mammographic images were selected. Four hundred and twenty (420) of the acquired
images were used for training, while two hundred and eighty (280) were used for testing.
The test dataset comprises 93 normal mammogram images and 187 abnormal mammogram images, of
which 92 were benign and 95 were cancerous.


2.2. Pre-processing 

A series of pre-processing steps was applied to improve the image quality for further processing.
The acquired images were first resized. Thereafter, the black and white borders were removed, the
breast boundary was detected, artefacts (labels) and background were eliminated, the pectoral muscle
was removed, and the contrast was adjusted. The
pre-processing techniques were Otsu's method for thresholding, Moore's algorithm for tracing the boundary on
thresholded images, and difference of Gaussians (DOG) as the contrast enhancement technique. The outputs at
the different pre-processing stages using DOG are depicted in Figure 1, where Figure 1(a) is the
original image, Figure 1(b) is black and white removal, Figure 1(c) is the flipped image, Figure 1(d) is the
traced boundary, Figure 1(e) is the pectoral muscle removal, Figure 1(f) is the enhanced image CLAHE,
Figure 1(g) is the enhanced image DOG, Figure 1(h) is the Gaussian blur image, Figure 1(i) is the ROI_image,
Figure 1(j) is the Shannon entropy image, and Figure 1(k) is the morphological filtering image. The
methodologies associated with the stages are discussed elsewhere [20].


2.3. Segmentation 

The ROI boundaries, edges and curves of the pre-processed images were located and segmented
using Gaussian blurring, Otsu's thresholding, and an automatic cropping technique. Fuzzy C-means clustering
was applied to the ROI and the textural image was evaluated using a GLCM. Shannon's
entropy-based thresholding was performed on the GLCM texture and morphological filtering was finally
applied. The output was the expected segmented image. A Gabor filter was also used for the segmentation
process in the case of the Gabor filter feature extraction. The steps involved in fuzzy C-means image
segmentation are highlighted elsewhere [21] and are:
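— Initialize the cluster centers $c_i$ and set $t = 0$.
— Initialize the fuzzy partition membership functions $\mu_{ij}$ according to (1).

$$\mu_{ij} = \left[ \sum_{l=1}^{C} \left( \frac{\lVert x_j - c_i \rVert}{\lVert x_j - c_l \rVert} \right)^{2/(k-1)} \right]^{-1} \qquad (1)$$

— Let $t = t + 1$ and compute new cluster centers $c_i$ using (2).

$$c_i = \frac{\sum_{j=1}^{N} \mu_{ij}^{k}\, x_j}{\sum_{j=1}^{N} \mu_{ij}^{k}} \qquad (2)$$

— Repeat steps 2 to 3 until convergence.

Here $x_j$ is the $j$-th of $N$ data points, $C$ is the number of clusters and $k > 1$ is the fuzziness exponent.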
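The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: it clusters one-dimensional intensities, uses the standard fuzzy C-means membership and centre updates, and the function name `fuzzy_c_means`, the fuzziness exponent `k=2.0`, the tolerance and the random initialization are all assumptions.

```python
import random


def fuzzy_c_means(points, n_clusters, k=2.0, max_iter=100, tol=1e-5, seed=0):
    """Cluster 1-D intensities with fuzzy C-means (k > 1 is the fuzziness exponent)."""
    rng = random.Random(seed)
    # Step 1: pick initial cluster centres (assumption: random distinct data points).
    centers = rng.sample(points, n_clusters)
    memberships = []
    for _ in range(max_iter):
        # Step 2: membership of every point in every cluster, as in (1).
        memberships = []
        for x in points:
            dists = [abs(x - c) + 1e-12 for c in centers]  # guard against zero distance
            memberships.append(
                [1.0 / sum((d_i / d_l) ** (2.0 / (k - 1.0)) for d_l in dists)
                 for d_i in dists]
            )
        # Step 3: new centres as membership-weighted means, as in (2).
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** k for row in memberships]
            new_centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
        # Step 4: stop once the centres no longer move appreciably.
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

For image segmentation, `points` would be the flattened ROI intensities, and each pixel would be assigned to the cluster in which its membership is largest.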
Figure 1. Summary of the pre-processing stage (a) original image, (b) black and white removal,
(c) flipped image, (d) traced boundary, (e) pectoral muscle removal, (f) enhanced image CLAHE, 
(g) enhanced image DOG, (h) Gaussian blur image, (i) ROI_image, (j) Shannon entropy image, and 
(k) morphological filtering image 


2.4. Gabor filter texture segmentation and extraction 

Texture is considered to be the most important property for representing masses since it is
useful for characterizing micro-patterns like edges, lines, and spots. Gabor filters have been used with different
scales and orientations to extract texture-based features [19], [22]. The filters have been applied in
various domains of computer science such as face recognition, gesture recognition, and optical character
recognition. Gabor filters are highly effective in CAD systems because mammograms contain micro-patterns
with a lot of texture [3]. A 2-D Gabor filter, defined as a Gaussian function modulated by an oriented complex
sinusoidal wave, can be described as given in (3) [23]:


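$$g(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left[-\frac{1}{2}\left(\frac{\tilde{x}^2}{\sigma_x^2} + \frac{\tilde{y}^2}{\sigma_y^2}\right)\right] \exp(2\pi j W \tilde{x}) \qquad (3)$$

where $\tilde{x}$ and $\tilde{y}$ are expressed as given in (4) and (5):

$$\tilde{x} = x\cos\theta + y\sin\theta \qquad (4)$$

$$\tilde{y} = -x\sin\theta + y\cos\theta \qquad (5)$$

$\sigma_x$ and $\sigma_y$ are the scaling parameters (i.e., they define the neighbourhood of a pixel where the weighted
summation takes place), $W$ is the central frequency of the complex sinusoid and $\theta \in [0, \pi]$ is the orientation
of the normal to the parallel stripes of the Gabor function.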
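As an illustration of the 2-D Gabor function described above, the sketch below samples it on a small square grid. It is not taken from the study: the function name `gabor_kernel`, the grid size and the parameter values in the usage note are assumptions, and the rotated coordinates use the standard rotation (negative sine term in the second coordinate).

```python
import cmath
import math


def gabor_kernel(size, sigma_x, sigma_y, W, theta):
    """Sample the complex 2-D Gabor function on a size x size grid (size must be odd)."""
    half = size // 2
    norm = 1.0 / (2.0 * math.pi * sigma_x * sigma_y)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by the complex sinusoidal carrier.
            envelope = math.exp(-0.5 * (xr ** 2 / sigma_x ** 2 + yr ** 2 / sigma_y ** 2))
            carrier = cmath.exp(2j * math.pi * W * xr)
            row.append(norm * envelope * carrier)
        kernel.append(row)
    return kernel
```

For example, `gabor_kernel(7, 2.0, 2.0, 0.1, math.pi / 4)` returns a 7 x 7 complex kernel oriented at 45 degrees; a filter bank is obtained by looping over several values of `W` and `theta`.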

Gabor filter was used to segment the mammogram dataset before feature extraction was carried out. 
The segmentation of the images using the Gabor filter is portrayed in Figure 2. The input image i(x, y) was 
expected to be composed of two textures. 


[Figure 2: block diagram. Input image i(x, y) -> Gabor pre-filter h(x, y) -> i_h(x, y) -> magnitude operator -> m(x, y) -> Gaussian post-filter g_p(x, y) -> m_p(x, y) -> threshold (segmentation) -> segmented image i_s(x, y)]

Figure 2. Segmentation using Gabor filter
The image was first passed through a Gabor pre-filter with impulse response $h(x, y)$. The Gabor
function $h(x, y)$ in (6) is a complex sinusoid centred at a frequency $(U, V)$ and modulated by a Gaussian
envelope $g(x, y)$ given in (7). The spatial extent of the Gaussian envelope is determined by the parameter $\sigma_g$.

$$h(x, y) = g(x, y)\exp[-j2\pi(Ux + Vy)] \qquad (6)$$

and

$$g(x, y) = \frac{1}{2\pi\sigma_g^2}\exp\left[-\frac{x^2 + y^2}{2\sigma_g^2}\right] \qquad (7)$$

Further, the 2D Fourier transform of $h(x, y)$ is

$$H(u, v) = G(u - U, v - V) \qquad (8)$$

where

$$G(u, v) = \exp\left[-2\pi^2\sigma_g^2(u^2 + v^2)\right] \qquad (9)$$

is the Fourier transform of $g(x, y)$. The parameters $(U, V, \sigma_g)$ determine $h(x, y)$. From (8) and (9), the
Gabor function is basically a bandpass filter centered around frequency $(U, V)$, with bandwidth determined
by $\sigma_g$. It was assumed, for simplicity, that the Gaussian envelope $g(x, y)$ is a symmetrical function. The
output of the pre-filter stage $i_h(x, y)$ is the convolution of the input image with the filter response:

$$i_h(x, y) = h(x, y) * i(x, y) \qquad (10)$$

The magnitude of the first-stage output is computed in the second stage as expressed in (11):

$$m(x, y) = |i_h(x, y)| = |h(x, y) * i(x, y)| \qquad (11)$$

A low-pass Gaussian post-filter $g_p(x, y)$ is applied to the pre-filter output $m(x, y)$, yielding the post-filtered
image

$$m_p(x, y) = m(x, y) * g_p(x, y) \qquad (12)$$

where

$$g_p(x, y) = \frac{1}{2\pi\sigma_p^2}\exp\left[-\frac{x^2 + y^2}{2\sigma_p^2}\right] \qquad (13)$$

Generally, $i_h(x, y)$ is referred to as the pre-filtered image, $m(x, y)$ as the pre-filter output, and $m_p(x, y)$ as
the post-filter output. Finally, the segmented image $i_s(x, y)$ was obtained from the Gaussian post-filter output
$m_p(x, y)$ by applying a threshold to $m_p(x, y)$; points above the threshold are assigned to one texture,
and points below to the other.
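The pre-filter, magnitude, post-filter and threshold chain can be sketched as follows. This is a slow, pure-Python illustration for small images, not the code used in the study; the function names, the kernel size, the values of `sigma_g` and `sigma_p`, the zero padding and the midpoint threshold are all assumptions.

```python
import cmath
import math


def gaussian(size, sigma):
    """Sampled 2-D Gaussian on a size x size grid (size must be odd)."""
    half = size // 2
    return [[math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]


def gabor_prefilter(size, sigma_g, U, V):
    """Complex Gabor impulse response: Gaussian envelope times a complex sinusoid at (U, V)."""
    half = size // 2
    g = gaussian(size, sigma_g)
    return [[g[y + half][x + half] * cmath.exp(-2j * math.pi * (U * x + V * y))
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]


def convolve2d(image, kernel):
    """Same-size 2-D convolution with zero padding."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0j] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0j
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    rr, cc = r - dy, c - dx
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[dy + half][dx + half] * image[rr][cc]
            out[r][c] = acc
    return out


def segment(image, U, V, sigma_g=2.0, sigma_p=3.0, size=9, threshold=None):
    """Two-texture segmentation: pre-filter, magnitude, Gaussian post-filter, threshold."""
    i_h = convolve2d(image, gabor_prefilter(size, sigma_g, U, V))   # pre-filtered image
    m = [[abs(v) for v in row] for row in i_h]                      # magnitude operator
    m_p = [[v.real for v in row]
           for row in convolve2d(m, gaussian(size, sigma_p))]       # post-filter output
    if threshold is None:
        # Assumption: threshold halfway between the smallest and largest response.
        flat = [v for row in m_p for v in row]
        threshold = (min(flat) + max(flat)) / 2.0
    return [[1 if v > threshold else 0 for v in row] for row in m_p]
```

With a filter tuned to the dominant frequency of one texture, pixels inside that texture receive label 1 and the remaining pixels receive label 0. In practice an FFT-based or library convolution would replace the explicit loops.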

The segmented image was then extracted, and the Gabor filter-based technique was used to extract the
textural features of the mammography images. Gabor filters are orientation- and frequency-selective filters
that depend on several parameters, specifically a frequency $f$ and an orientation $\theta$. Forty different Gabor
filters (5 frequencies x 8 orientations) were used in this study. Each segmented ROI was partitioned into
sub-windows, and the Gabor filter bank was applied to each window separately. Moment-based features
(mean, standard deviation, skewness) were computed from the magnitude of the Gabor filter bank responses to
obtain a texture feature set of one thousand and eighty (1080) features.
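The moment computation can be illustrated with the short sketch below. It assumes the magnitude responses of the 40 filters are already available as 2-D lists, and it assumes a 3 x 3 partition into sub-windows; the text does not state the number of sub-windows, and this value is chosen only because 40 filters x 9 sub-windows x 3 moments gives 1080 features. All function names are hypothetical.

```python
import math


def moments(values):
    """Return (mean, standard deviation, skewness) of a flat list of numbers."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    skew = 0.0 if std == 0 else sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    return mean, std, skew


def sub_windows(response, grid=3):
    """Yield the pixels of each block in a grid x grid partition of a 2-D response map."""
    h, w = len(response), len(response[0])
    bh, bw = h // grid, w // grid
    for gr in range(grid):
        for gc in range(grid):
            yield [response[r][c]
                   for r in range(gr * bh, (gr + 1) * bh)
                   for c in range(gc * bw, (gc + 1) * bw)]


def gabor_feature_vector(responses, grid=3):
    """Concatenate the three moments of every sub-window of every filter response."""
    features = []
    for response in responses:
        for block in sub_windows(response, grid):
            features.extend(moments(block))
    return features
```

Passing 40 response maps with `grid=3` produces a vector of length 1080 under these assumptions.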


2.5. GLCM algorithm for feature extraction 

In this study, fourteen textural features were employed, namely angular second moment (energy), correlation,
contrast, entropy, homogeneity, inverse difference moment, sum entropy, sum variance, sum
average, difference average, difference variance, difference entropy, information measure of correlation 1 and
information measure of correlation 2. The GLCM was computed for the four directions around
the pixel of interest. The method used an 8x8 sliding window for the co-occurrence calculation and the
extraction of the textural features. The co-occurrence matrices were computed for
four directions (0°, 45°, 90° and 135°) at a distance of one (d=1). The fourteen texture features were
computed from each GLCM to obtain a set of 56 texture features (4 directions x 1 distance x 14 texture features)
for each sliding window.
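A minimal sketch of the co-occurrence computation for one window is given below. It is illustrative only: it builds a symmetric, normalized GLCM for the four directions at d = 1 and computes just four of the fourteen features (energy, contrast, inverse difference moment and entropy). The function names and the choice of a symmetric matrix are assumptions.

```python
import math

# (row, column) step to the neighbouring pixel for each direction at distance d = 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(window, levels, angle):
    """Normalized, symmetric GLCM of a 2-D list of integer gray levels in [0, levels)."""
    dr, dc = OFFSETS[angle]
    counts = [[0.0] * levels for _ in range(levels)]
    h, w = len(window), len(window[0])
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < h and 0 <= c2 < w:
                a, b = window[r][c], window[r2][c2]
                counts[a][b] += 1   # count the pair in both orders (symmetric matrix)
                counts[b][a] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_subset(p):
    """Energy, contrast, inverse difference moment and entropy of a normalized GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    energy = sum(p[i][j] ** 2 for i, j in cells)
    contrast = sum((i - j) ** 2 * p[i][j] for i, j in cells)
    idm = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)
    entropy = -sum(p[i][j] * math.log(p[i][j]) for i, j in cells if p[i][j] > 0)
    return [energy, contrast, idm, entropy]


def glcm_features(window, levels):
    """Concatenate the feature subset over the four directions."""
    feats = []
    for angle in (0, 45, 90, 135):
        feats.extend(texture_subset(glcm(window, levels, angle)))
    return feats
```

With all fourteen features implemented, the same loop over four directions would yield the 56 values per 8x8 window described above.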


2.6. Normalization of features 
The textural features of the mammography images extracted by the GLCM and Gabor techniques
were normalized using the min-max technique as given in (14) and (15):

$$f_{glcm} = \frac{f'_{glcm} - \min(f'_{glcm})}{\max(f'_{glcm}) - \min(f'_{glcm})} \qquad (14)$$

$$f_{gabor} = \frac{f'_{gabor} - \min(f'_{gabor})}{\max(f'_{gabor}) - \min(f'_{gabor})} \qquad (15)$$

where $f'_{glcm}$ and $f'_{gabor}$ are the features obtained using GLCM and Gabor, respectively, while $f_{glcm}$ and
$f_{gabor}$ are the normalized features.
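The min-max scaling above maps every feature value into the range [0, 1]. A minimal sketch follows; the function name is hypothetical, and returning zeros for a constant feature vector is an assumption made to avoid division by zero.

```python
def min_max_normalize(features):
    """Scale a list of feature values to [0, 1] with the min-max rule."""
    lo, hi = min(features), max(features)
    if hi == lo:
        # Assumption: a constant feature carries no information and maps to 0.
        return [0.0 for _ in features]
    return [(f - lo) / (hi - lo) for f in features]
```

For instance, the raw values 2, 4 and 6 become 0, 0.5 and 1. The same function would be applied separately to the GLCM and the Gabor feature sets.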


2.7. Optimal feature selection 
The general formulation of the optimal feature selection problem used in this study is:

$$\min \; \Phi\big(y(F_f)\big) \qquad (16)$$

subject to:

$$0 < h\big(x(t), z(t), w(t), K_{best}, F_f\big) < 1 \qquad (17)$$

$$F_f \in F_t$$

$$0 < w(t) < 1$$

$$w(t) = \begin{cases} m_i(t) & \text{if } m_i(t) \ge w(t) \\ w(t) & \text{otherwise, } m_i(t) < w(t) \end{cases} \qquad (18)$$

where $x(t) \in \mathbb{R}^n$ and $z(t) \in \mathbb{R}^n$ are the vectors of the GLCM and Gabor state variables, respectively. The
entire state vector is denoted as $y = [x \; z]$, where $x$ is the set of feature vectors of GLCM and $z$ is the set of
feature vectors of the Gabor filter bands. The problem is defined on the feature horizon $F_t = [F_0, F_f]$.

The feature-dependent control variables $K_{best} \in \mathbb{R}^n$ and possibly the final feature $F_f$ are the decision
variables for optimization. The goal of the optimization is to find the optimal set of decision variables that
minimizes the objective function $\Phi$, that is, $\Phi(y(F_f))$. The search space for finding the optimum is restricted
by constraints, which describe the weight and feature parameter requirements to be fulfilled during
fusion at the feature selection level. The study considered the weight constraint $w(t)$ and the feature constraint $h(\ldots)$.


2.8. Weighted average gravitational search algorithm 
WA-GSA was used to fuse the normalized features. This was achieved by modifying the fitness
function of the GSA using (19):

$$fit_i(t) = \sum \left[ w(t)\, f_{glcm_i}(t) + (1 - w(t))\, f_{gabor_i}(t) \right] \qquad (19)$$

where $w(t)$ is the weight of the GLCM features and $(1 - w(t))$ is the weight of the Gabor features.

Feature level fusion can be done either at the feature extraction stage or at the feature selection
stage. The weighted average method has been used previously as a fusion technique and was able to suppress
noise in the source feature images. At the same time, it also suppresses the salient features that
should be preserved in the fused feature image, thereby producing a low contrast result. In order to reduce
this weakness of the weighted average and to obtain a balanced feature fusion, WA-GSA was developed and
employed in this study.

The WA-GSA fusion technique was used at the feature selection phase, which dealt with the selection
and combination of GLCM and Gabor features to remove redundant and irrelevant features. The objective is
to reduce the computational burden of feature concatenation by choosing optimal subsets of features from the
two textural feature sets. The WA-GSA technique used the following parameters for fusing GLCM features and
Gabor features: a maximum of 100 iterations and a number of agents (N) equal to the maximum size of the feature vector.
The algorithmic steps of the WA-GSA technique used to achieve the fusion are as follows:

— Step 1: Set $f_{gabor}$ = Gabor features and $f_{glcm}$ = GLCM features.


— Step 2: Agents initialization. The positions of the N agents are initialized randomly.

$$X_i = (x_i^1, \ldots, x_i^d, \ldots, x_i^n) \quad \text{for } i = 1, 2, \ldots, N \qquad (20)$$

$$w(t) = rand, \quad 0 < w(t) < 1$$

$x_i^d$ represents the position of the $i$-th agent in the $d$-th dimension, while $n$ is the space dimension. $w(t)$
represents the initial weight.
— Step 3: Fitness evaluation and best fitness computation. The fitness of every agent is evaluated at each
iteration using (21), subject to (22), and the best and worst fitness values are then determined.

$$fit_i(t) = \sum \left[ w(t)\, f_{glcm_i}(t) + (1 - w(t))\, f_{gabor_i}(t) \right] \qquad (21)$$

subject to

$$w(t) = \begin{cases} m_i(t) & \text{if } m_i(t) \ge w(t) \\ w(t) & \text{otherwise, } m_i(t) < w(t) \end{cases} \qquad (22)$$

where $w(t)$ is the weight set for $f_{glcm}$ (GLCM features extracted from the mammography image) and
$1 - w(t)$ is the weight set for $f_{gabor}$ (Gabor features extracted from the mammography image).

For minimization problems, the best and worst fitness are:

$$best(t) = \min_{j \in \{1, \ldots, N\}} fit_j(t), \qquad worst(t) = \max_{j \in \{1, \ldots, N\}} fit_j(t)$$

For maximization problems, the best and worst fitness are:

$$best(t) = \max_{j \in \{1, \ldots, N\}} fit_j(t), \qquad worst(t) = \min_{j \in \{1, \ldots, N\}} fit_j(t)$$

$fit_i(t)$ represents the fitness value of the $i$-th agent at iteration $t$, while $best(t)$ and $worst(t)$ represent the best
and worst fitness at iteration $t$.
— Step 4: Gravitational constant (G) computation. The gravitational constant $G(t)$ is computed using (23).

$$G(t) = G_0\, e^{-\alpha t / T} \qquad (23)$$

$G_0$ and $\alpha$ are initialized at the beginning, and $G(t)$ is reduced with time to control the search accuracy. $T$ is the
total number of iterations.

— Step 5: Calculation of the masses of the agents. Gravitational and inertia masses for each agent are
calculated at iteration $t$. Masses in GSA depend upon the fitness value of the agents.

$$M_{ai} = M_{pi} = M_{ii} = M_i, \quad i = 1, 2, \ldots, N$$

$$m_i(t) = \frac{fit_i(t) - worst(t)}{best(t) - worst(t)}, \qquad M_i(t) = \frac{m_i(t)}{\sum_{j=1}^{N} m_j(t)} \qquad (24)$$

where $M_{ii}$ and $M_{pi}$ are the inertia and passive gravitational masses of the $i$-th agent, respectively, and $M_{aj}$ is the active
gravitational mass of the $j$-th agent. $fit_i$ is the fitness value of the $i$-th agent.
— Step 6: Calculation of the agents' accelerations. The acceleration of each agent is calculated using (25).

$$a_i^d(t) = \frac{F_i^d(t)}{M_{ii}(t)} \qquad (25)$$

$F_i^d(t)$ is the total force acting on the $i$-th agent, calculated as shown in (26):

$$F_i^d(t) = \sum_{j \in Kbest,\; j \ne i} rand_j\, F_{ij}^d(t) \qquad (26)$$

Kbest is the set of the first K agents with the best fitness value and biggest mass. Kbest is reduced in each
iteration, and at the end only one agent applies force to the other agents. The force on the $i$-th agent from the mass of the
$j$-th agent during iteration $t$ is computed using (27).

$$F_{ij}^d(t) = G(t) \cdot \frac{M_{pi}(t) \times M_{aj}(t)}{R_{ij}(t) + \varepsilon} \left( x_j^d(t) - x_i^d(t) \right) \qquad (27)$$

$R_{ij}(t)$ is the Euclidean distance between two agents $i$ and $j$ at iteration $t$. $G(t)$ is the gravitational constant
calculated using (23), while $\varepsilon$ is a small constant.


— Step 7: Velocity and positions of agents. The velocity update equation for the agents is defined as given
in (28).

$$v_i^d(t + 1) = rand_i \times v_i^d(t) + a_i^d(t) \qquad (28)$$

$rand_i$ is a random variable in the interval [0,1]. $v_i^d(t)$ and $v_i^d(t + 1)$ are the velocities of the $i$-th individual during
iterations $t$ and $t + 1$, respectively. The position update equation for individuals is defined as (29).

$$x_i^d(t + 1) = x_i^d(t) + v_i^d(t + 1) \qquad (29)$$

$x_i^d(t)$ and $x_i^d(t + 1)$ are the positions of the $i$-th individual during iterations $t$ and $t + 1$, respectively. The
velocity of individuals is updated during each iteration. Due to changes in velocity, every individual updates
its position.

— Step 8: Repeat steps 3 to 7 until the iterations reach their maximum limit. The
best fitness value at the final iteration is taken as the global fitness, while the position of the
corresponding agent in the specified dimensions is taken as the global solution of the problem,
which gives the selected features, $F_G(t)$.

$$f_{fused}(t) = F_G(t) \qquad (30)$$

where $f_{fused}(t)$ is the fused features at the feature selection level.
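The steps above can be sketched as a generic gravitational search loop with a pluggable fitness function. This is a hedged illustration and not the WA-GSA implementation used in the study: how agent positions map to selected features is not reproduced here, the total force is assumed to be the randomly weighted sum of the pairwise forces from the Kbest agents as in standard GSA, and the function names, the values `g0=1.0` and `alpha=20.0`, the linear shrinking of Kbest and the clipping of positions to the bounds are all assumptions.

```python
import math
import random


def weighted_fitness(f_glcm, f_gabor, w):
    """Weighted-average fitness: w weights the GLCM terms, (1 - w) the Gabor terms."""
    return sum(w * a + (1.0 - w) * b for a, b in zip(f_glcm, f_gabor))


def gsa_minimize(fitness, dim, n_agents=20, max_iter=100, g0=1.0, alpha=20.0,
                 bounds=(0.0, 1.0), eps=1e-12, seed=0):
    """Minimize `fitness` over a box with a basic gravitational search."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 2: random initial positions, zero initial velocities.
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    V = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_fit = None, math.inf
    for t in range(max_iter):
        # Step 3: evaluate fitness, track best and worst (minimization).
        fit = [fitness(x) for x in X]
        best, worst = min(fit), max(fit)
        if best < best_fit:
            best_fit, best_x = best, X[fit.index(best)][:]
        # Step 4: gravitational constant decays with the iteration count.
        G = g0 * math.exp(-alpha * t / max_iter)
        # Step 5: normalized masses from fitness.
        m = [1.0] * n_agents if best == worst else [(f - worst) / (best - worst) for f in fit]
        total = sum(m)
        M = [mi / total for mi in m]
        # Step 6: accelerations from the K best agents (K shrinks from N to 1).
        k = max(1, round(n_agents - (n_agents - 1) * t / max(1, max_iter - 1)))
        kbest = sorted(range(n_agents), key=lambda i: fit[i])[:k]
        acc = [[0.0] * dim for _ in range(n_agents)]
        for i in range(n_agents):
            for j in kbest:
                if j != i:
                    R = math.dist(X[i], X[j])
                    r_j = rng.random()
                    for d in range(dim):
                        # The mass of agent i cancels between force and acceleration.
                        acc[i][d] += r_j * G * M[j] * (X[j][d] - X[i][d]) / (R + eps)
        # Step 7: velocity and position updates, clipped to the search bounds.
        for i in range(n_agents):
            r_i = rng.random()
            for d in range(dim):
                V[i][d] = r_i * V[i][d] + acc[i][d]
                X[i][d] = min(hi, max(lo, X[i][d] + V[i][d]))
    return best_x, best_fit
```

In a feature-fusion setting, `fitness` would wrap `weighted_fitness` together with whatever rule maps an agent position to a candidate feature subset; that mapping is left open here.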


2.9. Support vector machine for classification 

For the classification of features extracted by Gabor and GLCM techniques, SVM was used. In this 
study, a binary classification problem, where textural features can be characterized as either cancer region or 
normal was considered. The SVM finds an optimal hyper-plane that can separate the data belonging to 
different classes with large margins in high dimensional space [24]. The margin is defined as the sum of 
distances to the decision boundary (hyper-plane) from the nearest points (support vectors) of the two classes. 
SVM formulation is based on statistical learning theory and has attractive generalization capabilities in linear 
as well as non-linear decision problems. SVM takes classification decisions using (31) for optimal 
hyperplane with maximum margin: 


$$g(x) = w^T x + w_0 = 0 \qquad (31)$$

where $x$ is the feature descriptor and $w$ and $w_0$ are unknown parameters, which are computed using training
samples $\{(x_i, y_i) \mid 1 \le i \le N\}$, where $y_i \in \{+1, -1\}$ is the class label; the computation involves the
solution of an optimization problem based on large margin theory [25]. Once the optimal hyper-plane has
been computed, the classification of a test sample $x$ is performed using (32):

$$g(x) = \sum_{i=1}^{N_s} \lambda_i y_i\, x_i^T x + w_0 \qquad (32)$$

where $\lambda_i$ are Lagrange multipliers and $N_s$ is the number of support vectors, i.e., the training samples
corresponding to non-zero $\lambda_i$. In case the data samples belonging to the two classes are not linearly separable, the
kernel trick is used. Using a kernel function, the function $g(x)$ is expressed in (33):

$$g(x) = \sum_{i=1}^{N_s} \lambda_i y_i\, K(x_i, x) + w_0 \qquad (33)$$

where $K(x_i, x)$ is the kernel function that expresses the inner product of the data samples in the higher
dimensional space. In this formulation, the misclassification penalty or error is controlled with a user-defined parameter C
(regularization parameter, controlling the trade-off between the error of SVM and margin maximization), which
is tied with the kernel. Several kernels are available, e.g., linear, polynomial, sigmoid, and
radial basis function (RBF) [18]. In this study, the RBF kernel was used as given in (34) and C was set to 2000.

$$K(x_i, x) = \exp\left(-\gamma \lVert x_i - x \rVert^2\right), \quad \gamma > 0 \qquad (34)$$

The parameter $\gamma$ controls the width of the kernel function. The RBF kernel is thus tied to two parameters, $\gamma$ and C.
Figure 3 summarizes the schematic of the methodology and the process flow.
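To make the kernel decision rule concrete, the sketch below evaluates it for an already trained classifier. Training (solving the margin optimization for the multipliers and the bias) is not shown, and the function names as well as the support vectors, multipliers and gamma in the usage note are hypothetical values chosen for illustration.

```python
import math


def rbf_kernel(xi, x, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, x)))


def decision_function(x, support_vectors, labels, multipliers, w0, gamma):
    """Kernel SVM decision value: weighted sum of kernel responses plus bias."""
    return sum(lam * y * rbf_kernel(sv, x, gamma)
               for sv, y, lam in zip(support_vectors, labels, multipliers)) + w0


def classify(x, support_vectors, labels, multipliers, w0, gamma):
    """Assign +1 or -1 according to the sign of the decision value."""
    value = decision_function(x, support_vectors, labels, multipliers, w0, gamma)
    return 1 if value >= 0 else -1
```

With two hypothetical support vectors `[0.0, 0.0]` (label +1) and `[2.0, 2.0]` (label -1), unit multipliers, zero bias and `gamma = 0.5`, a sample near the first support vector is assigned +1 and a sample near the second is assigned -1. A larger gamma makes each support vector's influence more local.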
Figure 3. Proposed flow for fused optimization


2.10. Performance evaluation 


The overall performance of the techniques under study was evaluated based on recognition
accuracy, false positive rate (FPR), false negative rate (FNR) and computation time. A confusion matrix was
used to determine the values of the performance metrics. It contains the true positive (TP), false positive (FP),
false negative (FN) and true negative (TN) counts, from which the metrics are computed as expressed in (35)-(37), respectively.

$$\text{False positive rate (FPR)} = \frac{FP}{FP + TN} \qquad (35)$$

$$\text{False negative rate (FNR)} = \frac{FN}{TP + FN} \qquad (36)$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (37)$$
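The three metrics follow directly from the four confusion-matrix counts. The helper below is a small sketch with a hypothetical function name; the usage note applies it to the WA-GSA counts reported in Table 1.

```python
def evaluate(tp, fn, fp, tn):
    """Return (FPR, FNR, accuracy) as fractions from confusion-matrix counts."""
    fpr = fp / (fp + tn)
    fnr = fn / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return fpr, fnr, accuracy
```

For the WA-GSA counts (TP = 84, FN = 9, FP = 3, TN = 184), `evaluate(84, 9, 3, 184)` gives 3/187, 9/93 and 268/280, that is, 1.60%, 9.68% and 95.71%, which matches the values reported for WA-GSA.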


3. RESULTS AND DISCUSSION 

The presented techniques were implemented using MATLAB R2018 on a Windows 10 64-bit
operating system with an Intel®Core™ i5-2540M central processing unit (CPU) @2.60GHz, 6 GB random
access memory (RAM) and a 500 GB hard disk drive. The application was designed to run across different
platforms. A total of 280 mammographic images were used to test the techniques. Three (3)
techniques were evaluated: WA-GSA, which combines the optimum Gabor and GLCM features,
Gabor features, and GLCM features. The results of the feature extraction were presented and evaluated for
each of the techniques. Table 1 presents the contingency table for SVM classification using WA-GSA, Gabor
and GLCM features. The mammogram test dataset comprises 280 images, of which 93 were normal and
187 were abnormal (benign/malignant).

As shown in Table 1, the SVM with WA-GSA features correctly classified 84 of the normal
mammograms as normal, against 81 for Gabor and 76 for GLCM. The WA-GSA algorithm
also performed better in terms of the
corresponding false negatives. Similarly, for the abnormal dataset, WA-GSA performed
better by correctly identifying 184 of the abnormal mammograms as abnormal, against the 181 and 179 recorded
for Gabor and GLCM, respectively. Table 1 therefore shows the superiority of WA-GSA over Gabor and GLCM in
terms of the confusion matrix.


Table 1. Contingency table for classification using WA-GSA, Gabor and GLCM features

                  WA-GSA                 Gabor                  GLCM
                  Predicted class        Predicted class        Predicted class
Actual class      Normal    Abnormal     Normal    Abnormal     Normal    Abnormal
Normal (93)       84 (TP)   9 (FN)       81 (TP)   12 (FN)      76 (TP)   17 (FN)
Abnormal (187)    3 (FP)    184 (TN)     6 (FP)    181 (TN)     8 (FP)    179 (TN)

Weighted average gravitational search algorithm (WA-GSA); Gray level co-occurrence matrices (GLCM)


Furthermore, Table 2 shows the performance of WA-GSA, Gabor and GLCM features on the validation
measures: FPR, FNR, accuracy, and computation time. The results in Table 2 show that at γ = 3
and C=1,000, the SVM technique achieved an FPR, FNR and accuracy of 1.60%, 9.68% and 95.71% at 271.83
seconds, respectively, for WA-GSA features. Also, the SVM technique achieved an FPR, FNR and accuracy of
3.21%, 12.90% and 93.57% at 2351.29 seconds, respectively, for Gabor features. Similarly, the SVM
technique achieved an FPR, FNR and accuracy of 4.28%, 18.28% and 91.07% at 384.54 seconds, respectively,
for GLCM features.


Table 2. WA-GSA, Gabor and GLCM features for SVM classification

Technique    FPR (%)    FNR (%)    Accuracy (%)    Time (s)
WA-GSA       1.60       9.68       95.71           271.83
Gabor        3.21       12.90      93.57           2351.29
GLCM         4.28       18.28      91.07           384.54


As observed in Table 2, the
WA-GSA features outperformed both the Gabor and GLCM features on every validation measure: FPR, FNR,
accuracy and computation time. Figures 4 and 5 depict the
relationship between the computation time and the gamma value, and between the accuracy and the gamma value,
respectively.

Table 3 summarizes the statistical comparison between WA-GSA, Gabor and GLCM features.
Inferential statistical analysis using a paired-samples t-test was carried out on the accuracy, FPR, FNR and
computation time obtained for WA-GSA and Gabor features, and for WA-GSA and GLCM features.
The test of significance, evaluated at the 95%
confidence level, shows that there was a significant difference between WA-GSA features and Gabor features
as well as between WA-GSA features and GLCM features at P<0.05. The t-test result validates the fact that the SVM
technique with WA-GSA features outperformed the SVM technique with both Gabor features and GLCM
features in terms of accuracy, FPR, FNR and computation time in the classification of masses in digital
mammography.

Int J Elec & Comp Eng, Vol. 12, No. 5, October 2022: 5001-5013
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Figure 4. Variation of computation time versus Figure 5 Variation of accuracy versus gamma value 


gamma value 


Table 3. Statistical comparison between WA-GSA, Gabor and GLCM features using a paired-samples t-test 
at the 0.05 significance level 


Techniques        Measures   Mean difference   t value   p-value 

WA-GSA vs GLCM    Accuracy   4.21              11.59     .000 
                  FPR        -3.00             -7.52     .002 
                  FNR        -6.66             -6.08     .004 
                  C-Time     -187.69           -4.13     .014 

WA-GSA vs Gabor   Accuracy   2.00              4.81      .009 
                  FPR        -1.50             -4.79     .009 
                  FNR        -3.01             -4.81     .009 
                  C-Time     -1849.11          -13.02    .000 


*Computation time (C-Time); False negative rate (FNR); False positive rate (FPR) 
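
The paired-samples t-test behind Table 3 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' analysis: the accuracy values are made up, the function name `paired_t` is hypothetical, and five paired observations are an assumption. Significance is judged against the standard two-tailed critical value for 4 degrees of freedom; a statistics library such as SciPy (`scipy.stats.ttest_rel`) would also return the exact p-value.

```python
import math
import statistics


def paired_t(a, b):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return mean_diff, mean_diff / std_err, n - 1


# Hypothetical accuracies (%) at five SVM settings; these are NOT the paper's data.
wa_gsa = [95.0, 95.7, 94.6, 95.4, 95.0]
glcm = [91.1, 91.0, 90.4, 91.5, 90.6]

mean_diff, t_value, dof = paired_t(wa_gsa, glcm)
T_CRIT = 2.776  # two-tailed critical t for alpha = 0.05 with 4 degrees of freedom
print(f"mean diff={mean_diff:.2f}, t={t_value:.2f}, df={dof}, "
      f"significant={abs(t_value) > T_CRIT}")
```

A positive mean difference with |t| above the critical value corresponds to the accuracy rows of Table 3; for FPR, FNR and computation time the difference is negative because lower values are better.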


The outcome of this study justifies fusing the GLCM and Gabor features and optimizing them with the 
WA-GSA technique. The combined textural features obtained from WA-GSA were more discriminating 
and computationally cheaper than either feature set alone, which reduced the high false positive and 
false negative rates associated with existing techniques in digital mammography. The results also indicate 
that the gamma value is an important SVM parameter with a great influence on the accuracy and 
complexity of the classification models, as supported by [26], [27]. 
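
To illustrate why gamma matters, the sketch below evaluates a radial basis function (RBF) kernel, K(x, z) = exp(-gamma * ||x - z||^2), for several gamma values. It assumes the SVM used an RBF kernel, which the gamma parameter suggests but this section does not state; the feature vectors and the function name `rbf_kernel` are made up for illustration.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel value between two feature vectors of equal length."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Hypothetical 3-dimensional feature vectors (squared distance is 0.24).
x = [0.2, 0.5, 0.1]
z = [0.4, 0.1, 0.3]

for gamma in (0.1, 1.0, 3.0, 10.0):
    print(f"gamma={gamma:>4}: K(x, z)={rbf_kernel(x, z, gamma):.4f}")
```

As gamma grows, the kernel value for the same pair of points falls toward zero, so each training sample influences a smaller neighbourhood and the decision boundary becomes more flexible. That is one way gamma can affect both accuracy and model complexity.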

This study is consonant with the work of Suresh et al. [28], who used hybrid features involving GLCM 
to achieve improved performance in the classification of masses. The findings also corroborate the work 
of Fardin and Hassan [29], who combined Gabor and fast GLCM features to achieve a more discriminating 
feature in the classification of very high-resolution remote sensing images. The same holds in this 
study: combining the Gabor and GLCM features yielded more discriminating features for the classification 
of masses in digital mammography. Furthermore, Khan et al. [18] suggested that reducing the features in 
the Gabor filter bank is required to reduce time complexity in the classification of mammograms, and 
Xing and Jia [17] noted that GLCM features also tend to have high computational complexity. The 
WA-GSA technique addressed both concerns by producing a combined, optimized feature set from the 
Gabor and GLCM features with reduced computational complexity. Hence, the WA-GSA technique achieved 
more discriminating features that are less computationally expensive, with reduced false positive and 
false negative rates in the classification of masses in digital mammography. 


4. CONCLUSION 

In this study, the WA-GSA was used to fuse and optimize Gabor and GLCM features for the 
classification of masses in digital mammography. The results showed that the proposed hybrid feature 
extraction technique reduced the false positive and false negative rates by 1.61 and 3.22 percentage 
points, respectively, when compared with the Gabor technique, and by 2.68 and 8.60 percentage points, 
respectively, when compared with the GLCM technique. It also reduced the computation time by 2079.46 
and 112.71 seconds when compared with the Gabor and GLCM techniques, respectively. On these measures, 
the WA-GSA compared favourably with the existing conventional textural feature extraction methods. 
The results demonstrate the efficacy and accuracy of the proposed method in assisting radiologists 
with the diagnosis of breast cancer, and it is considered a suitable method for extracting features that 
ease tumor classification and reduce false positives. It should be considered in building an accurate 
and computationally efficient CAD system to help radiologists interpret mammograms for the detection 
and classification of lesions. It could also be adopted in clinical practice for better detection and 
classification of breast cancer. 
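
The reductions quoted in the conclusion follow directly from Table 2. The short check below recomputes them from the rounded table values; the dictionary layout and the function name `reductions` are illustrative choices, not part of the study.

```python
# Rounded values from Table 2: (FPR %, FNR %, computation time in seconds)
table2 = {
    "WA-GSA": (1.60, 9.68, 271.83),
    "Gabor": (3.21, 12.90, 2351.29),
    "GLCM": (4.28, 18.28, 384.54),
}


def reductions(baseline, proposed="WA-GSA"):
    """Absolute reduction of the proposed features relative to a baseline."""
    return tuple(round(b - p, 2) for b, p in zip(table2[baseline], table2[proposed]))


for name in ("Gabor", "GLCM"):
    d_fpr, d_fnr, d_time = reductions(name)
    print(f"vs {name}: FPR -{d_fpr} points, FNR -{d_fnr} points, time -{d_time} s")
```

The FPR and FNR figures are absolute differences between percentages, so they are best read as percentage points rather than relative reductions.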
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